digit recognition
Devanagari Digit Recognition using Quantum Machine Learning
Handwritten digit recognition in regional scripts, such as Devanagari, is crucial for multilingual document digitization, educational tools, and the preservation of cultural heritage. The script's complex structure and limited annotated datasets pose significant challenges to conventional models. This paper introduces the first hybrid quantum-classical architecture for Devanagari handwritten digit recognition, combining a convolutional neural network (CNN) for spatial feature extraction with a 10-qubit variational quantum circuit (VQC) for quantum-enhanced classification. Trained and evaluated on the Devanagari Handwritten Character Dataset (DHCD), the proposed model achieves a test accuracy of 99.80%, state of the art among quantum implementations, with a test loss of 0.2893 and an average per-class F1-score of 0.9980. Compared to equivalent classical CNNs, our model demonstrates superior accuracy with significantly fewer parameters and enhanced robustness. By leveraging quantum principles such as superposition and entanglement, this work establishes a novel benchmark for regional script recognition, highlighting the promise of quantum machine learning (QML) in real-world, low-resource language settings.
Digits micro-model for accurate and secure transactions
Chhablani, Chirag, Sharma, Nikhita, Hosier, Jordan, Gurbani, Vijay K.
Automatic Speech Recognition (ASR) systems are used in the financial domain to enhance the caller experience by enabling natural language understanding and facilitating efficient and intuitive interactions. Increasing use of ASR systems requires that such systems exhibit very low error rates. The predominant ASR models used to collect numeric data are either large, general-purpose commercial models (Google Speech-to-text (STT) or Amazon Transcribe) or open-source models (OpenAI's Whisper). Such ASR models are trained on hundreds of thousands of hours of audio data and require considerable resources to run. Despite recent progress in large speech recognition models, we highlight the potential of smaller, specialized "micro" models. Such light models can be trained to perform well on number-recognition tasks, competing with general models like Whisper or Google STT while using less than 80 minutes of training time and occupying at least an order of magnitude less memory. Also, unlike larger speech recognition models, micro-models are trained on carefully selected and curated datasets, which makes them highly accurate, agile, and easy to retrain, while using low compute resources. We present our work on creating micro-models for multi-digit number recognition that handle diverse speaking styles reflecting real-world pronunciation patterns. Our work contributes to domain-specific ASR models, improving digit recognition accuracy and data privacy. As an added advantage, their low resource consumption allows them to be hosted on-premise, keeping private data local instead of uploading it to an external cloud. Our results indicate that our micro-model makes fewer errors than the best-of-breed commercial or open-source ASRs in recognizing digits (1.8% error rate for our best micro-model versus 5.8% for Whisper) and has a low memory footprint (0.66 GB VRAM for our model versus 11 GB VRAM for Whisper).
NeuroWrite: Predictive Handwritten Digit Classification using Deep Neural Networks
Asish, Kottakota, Teja, P. Sarath, Chander, R. Kishan, Hema, Dr. D. Deva
The rapid evolution of deep neural networks has revolutionized the field of machine learning, enabling remarkable advancements in various domains. In this article, we introduce NeuroWrite, a method for classifying handwritten digits using deep neural networks. Our model identifies and categorises handwritten digits with high accuracy by combining convolutional neural networks (CNNs) and recurrent neural networks (RNNs). We give a thorough examination of the data preparation methods, network design, and training methods used in NeuroWrite. By implementing state-of-the-art techniques, we showcase how NeuroWrite can achieve high classification accuracy and robust generalization on handwritten digit datasets, such as MNIST. Furthermore, we explore the model's potential for real-world applications, including digit recognition in digitized documents, signature verification, and automated postal code recognition. NeuroWrite is a useful tool for computer vision and pattern recognition because of its performance and adaptability. The architecture, training procedure, and evaluation metrics of NeuroWrite are covered in detail in this study, illustrating how it can improve a number of applications that call for handwritten digit classification. The outcomes show that NeuroWrite is a promising method for raising the bar for deep neural network-based handwritten digit recognition.
Using a neural net to instantiate a deformable model
Deformable models are an attractive approach to recognizing non-rigid objects which have considerable within class variability. However, there are severe search problems associated with fitting the models to data. We show that by using neural networks to provide better starting points, the search time can be significantly reduced. The method is demonstrated on a character recognition task. In previous work we have developed an approach to handwritten character recognition based on the use of deformable models (Hinton, Williams and Revow, 1992a; Revow, Williams and Hinton, 1993).
DIGITOUR: Automatic Digital Tours for Real-Estate Properties
Chhikara, Prateek, Kuhar, Harshul, Goyal, Anil, Sharma, Chirag
A virtual or digital tour is a form of virtual reality technology which allows a user to experience a specific location remotely. Currently, these virtual tours are created by following a 2-step strategy. First, a photographer clicks a 360 degree equirectangular image; then, a team of annotators manually links these images for the "walkthrough" user experience. The major challenge in the mass adoption of virtual tours is the time and cost involved in manual annotation/linking of images. Therefore, this paper presents an end-to-end pipeline to automate the generation of 3D virtual tours using equirectangular images for real-estate properties. We propose a novel HSV-based coloring scheme for paper tags that need to be placed at different locations before clicking the equirectangular images using 360 degree cameras. These tags have two characteristics: i) they are numbered to help the photographer for placement of tags in sequence and; ii) bi-colored, which allows better learning of tag detection (using YOLOv5 architecture) in an image and digit recognition (using custom MobileNet architecture) tasks. Finally, we link/connect all the equirectangular images based on detected tags. We show the efficiency of the proposed pipeline on a real-world equirectangular image dataset collected from the Housing.com database.
Automatic Sudoku (Number Place) Solver with Digit Recognition and Integer Linear Programming
Sudoku is a logic-based number placement puzzle consisting of 81 cells divided into 9 rows, 9 columns, and 9 blocks. The goal of the game is to fill each cell with a number from 1–9 so that no number repeats in any row, column, or block. In this post, I aim to introduce an automatic sudoku solver based on digit recognition and integer linear programming that uses the following: Keras (trained on the MNIST database [1]) and OpenCV for digit recognition, and PuLP for integer linear programming. The MNIST database is also widely used for training and testing in the field of machine learning. In this section, I explain the overview of image processing for digit recognition.
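The integer linear programming side of such a solver can be written down compactly. The formulation below is the standard textbook one, offered as a sketch; the post's own PuLP model is not shown in this excerpt and may differ. The symbols are assumptions introduced here: x_{r,c,v} is a binary variable equal to 1 when cell (r, c) holds value v, B_b is the set of cells in block b, and G is the set of given clues (r, c, v) read by the digit recognizer.

```latex
\begin{align*}
& x_{r,c,v} \in \{0, 1\}
  && r, c, v \in \{1, \dots, 9\} \\
& \sum_{v=1}^{9} x_{r,c,v} = 1
  && \text{for every cell } (r, c) \\
& \sum_{c=1}^{9} x_{r,c,v} = 1
  && \text{for every row } r \text{ and value } v \\
& \sum_{r=1}^{9} x_{r,c,v} = 1
  && \text{for every column } c \text{ and value } v \\
& \sum_{(r,c) \in B_b} x_{r,c,v} = 1
  && \text{for every block } b \text{ and value } v \\
& x_{r,c,v} = 1
  && \text{for every given clue } (r, c, v) \in G
\end{align*}
```

Because any feasible assignment is a valid solution, the objective can be a constant; the solver is only asked for feasibility.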
AI Sudoku Solver
Sudoku is a puzzle in which players insert the numbers one to nine into a grid consisting of nine squares subdivided into a further nine smaller squares in such a way that every number appears once in each horizontal line, vertical line, and square. Using OpenCV, deep learning, and a backtracking algorithm, we can solve the sudoku puzzle. First, build a character recognition model that extracts digits from a Sudoku grid image, then apply a backtracking approach to solve it. The deep-learning-based AI_Sudoku_Solver architecture uses OpenCV (opencv 4.2.0) and Python (python 3.7). The convolutional neural network (CNN) model uses Keras (keras 2.3.1) on TensorFlow for digit recognition.
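The backtracking step described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the AI_Sudoku_Solver project's actual code: the function names (find_empty, is_valid, solve) are hypothetical, and the grid is assumed to arrive from the digit recognizer as a 9x9 list of lists with 0 marking an empty cell.

```python
from typing import List, Optional, Tuple

Grid = List[List[int]]  # 9x9 grid; 0 marks an empty cell (assumed convention)


def find_empty(grid: Grid) -> Optional[Tuple[int, int]]:
    """Return the (row, col) of the first empty cell, or None if the grid is full."""
    for r in range(9):
        for c in range(9):
            if grid[r][c] == 0:
                return r, c
    return None


def is_valid(grid: Grid, r: int, c: int, v: int) -> bool:
    """Check that placing v at (r, c) breaks no row, column, or 3x3 block rule."""
    if v in grid[r]:
        return False
    if any(grid[i][c] == v for i in range(9)):
        return False
    br, bc = 3 * (r // 3), 3 * (c // 3)  # top-left corner of the block
    for i in range(br, br + 3):
        for j in range(bc, bc + 3):
            if grid[i][j] == v:
                return False
    return True


def solve(grid: Grid) -> bool:
    """Fill the grid in place; return True if a solution was found."""
    spot = find_empty(grid)
    if spot is None:
        return True  # no empty cells left: solved
    r, c = spot
    for v in range(1, 10):
        if is_valid(grid, r, c, v):
            grid[r][c] = v
            if solve(grid):
                return True
            grid[r][c] = 0  # undo the guess and try the next value
    return False
```

Calling solve(grid) mutates the grid and returns False when the recognized clues admit no solution, which is also a cheap signal that the digit recognizer misread a cell.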
Digit Recognition: A Beginner's Guide to Keras
Over the last decade, the use of artificial neural networks (ANNs) has increased considerably. People have used ANNs in medical diagnoses, to predict Bitcoin prices, and to create fake Obama videos! With all the buzz about deep learning and artificial neural networks, haven't you always wanted to create one for yourself? In this tutorial, we'll create a model to recognize handwritten digits. We use the Keras library for training the model in this tutorial. Keras is a high-level library in Python that is a wrapper over TensorFlow, CNTK and Theano.
Kickstart your experiments from examples - Azure Machine Learning Studio
For example, to browse experiments that use a PCA-based anomaly detection algorithm: Under Categories click Experiment. Then, under Algorithms Used, click Show all and in the dialog box choose PCA-Based Anomaly Detection. You may have to scroll to see it. For example, to find experiments contributed by Microsoft related to digit recognition that use a two-class support vector machine algorithm, enter "digit recognition" in the search box.
Sparse Penalty in Deep Belief Networks: Using the Mixed Norm Constraint
Halkias, Xanadu, Paris, Sebastien, Glotin, Herve
Deep Belief Networks (DBNs) have been successfully applied to popular machine learning tasks. Specifically, when applied to hand-written digit recognition, DBNs have achieved accuracy rates of approximately 98.8%. In an effort to optimize the data representation achieved by the DBN and maximize its descriptive power, recent advances have focused on inducing sparse constraints at each layer of the DBN. In this paper we present a theoretical approach for sparse constraints in the DBN using the mixed norm for both non-overlapping and overlapping groups. We explore how these constraints affect the classification accuracy for digit recognition in three different datasets (MNIST, USPS, RIMES) and provide initial estimations of their usefulness by altering different parameters such as the group size and overlap percentage.
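To make the group size and overlap parameters concrete, here is a minimal sketch of a mixed-norm penalty over groups of hidden units. It assumes the common L1/L2 form (the sum over groups of each group's Euclidean norm); the abstract does not give the paper's exact formulation, so the functions make_groups and mixed_norm and the way overlap is turned into a stride are illustrative assumptions rather than the authors' method.

```python
import math
from typing import List, Sequence


def make_groups(n_units: int, group_size: int, overlap: float = 0.0) -> List[List[int]]:
    """Split unit indices 0..n_units-1 into consecutive groups.

    overlap is the fraction of group_size shared by neighbouring groups
    (0.0 gives non-overlapping groups). Hypothetical helper for illustration.
    """
    if group_size < 1:
        raise ValueError("group_size must be at least 1")
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be in [0, 1)")
    step = max(1, round(group_size * (1.0 - overlap)))  # stride between group starts
    groups: List[List[int]] = []
    start = 0
    while start < n_units:
        groups.append(list(range(start, min(start + group_size, n_units))))
        if start + group_size >= n_units:
            break  # the last group already reaches the final unit
        start += step
    return groups


def mixed_norm(activations: Sequence[float], groups: Sequence[Sequence[int]]) -> float:
    """L1/L2 mixed norm: L2 norm within each group, summed (L1) across groups."""
    return sum(math.sqrt(sum(activations[i] ** 2 for i in g)) for g in groups)
```

With non-overlapping groups each unit is penalized once; with overlap, units in the shared region contribute to two groups, which is why the overlap percentage changes how strongly neighbouring units are pushed toward zero together.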